An Exploration of the Linguistic Knowledge for Semantic Relation Extraction in Spanish
Authors
Abstract
A common strategy in Question Answering systems is to use high-quality ontologies or databases to answer questions efficiently. Some approaches to building or enriching these databases rely on machine learning classifiers that obtain semantically related terms from unstructured text. These classifiers are based on features that may encode several kinds of linguistic knowledge, from orthographic or lexical information to more complex features such as PoS tags, syntactic dependencies, or semantic information. In this paper we select four main types of linguistic features and systematically evaluate their performance on semantic Relation Extraction. Although combining some types of linguistic features improves the F-score of the classifiers, we observed that, by adjusting the positive/negative ratio of the training examples, we can build high-quality classifiers with a single type of linguistic feature based on generic lexico-syntactic patterns. Experiments were carried out on the Spanish version of Wikipedia.
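As an illustration of what adjusting the positive/negative ratio of the training examples can look like in practice, the following is a minimal sketch and not the procedure used in the paper. It assumes training examples are `(features, label)` pairs with label `1` for related term pairs and `0` for unrelated ones; the function name `rebalance`, the parameter `neg_per_pos`, and the toy data are hypothetical.

```python
import random


def rebalance(examples, neg_per_pos, seed=0):
    """Keep every positive example and subsample negatives.

    `examples` is a list of (features, label) pairs, label 1 or 0.
    `neg_per_pos` is the desired number of negatives per positive
    (an assumed knob, not a value taken from the paper).
    """
    positives = [ex for ex in examples if ex[1] == 1]
    negatives = [ex for ex in examples if ex[1] == 0]
    # Never ask for more negatives than are available.
    target = min(len(negatives), int(round(neg_per_pos * len(positives))))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return positives + rng.sample(negatives, target)


# Toy data (hypothetical): 10 positive and 90 negative candidate pairs.
toy = [(f"pos_{i}", 1) for i in range(10)] + [(f"neg_{i}", 0) for i in range(90)]
balanced = rebalance(toy, neg_per_pos=2.0)
```

Sweeping `neg_per_pos` over a few values and re-training the classifier each time is one simple way to observe how the ratio affects precision, recall, and F-score.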
Similar resources
Semantic Relation Extraction. Resources, Tools and Strategies
Relation extraction is a subtask of information extraction that aims at obtaining instances of semantic relations present in texts. This information can be arranged in machine-readable formats, useful for several applications that need structured semantic knowledge. The work presented in this paper explores different strategies to automate the extraction of semantic relations from texts in Port...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency
Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in biomedical literature, researchers have employed some sentence simplification techniques to improve the performance of the relation extraction methods. However, due to difficulty of the task, there is no noteworthy improvement in t...
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...
A Weakly-Supervised Rule-Based Approach for Relation Extraction
Rule-based approaches for information extraction usually achieve good precision values, even if they often need a lot of manual effort to be implemented. In this paper, we present a novel rule-based strategy for semantic relation extraction that takes advantage of partial syntactic parsing in order to simplify the linguistic structures containing instances of semantic relations. We also...